Sketching Earth-Mover Distance on Graph Metrics

Authors

  • Andrew McGregor
  • Daniel Stubbs
Abstract

We develop linear sketches for estimating the Earth-Mover distance between two point sets, i.e., the cost of the minimum weight matching between the points according to some metric. While Euclidean distance and edit distance are natural measures for vectors and strings respectively, Earth-Mover distance is a well-studied measure that is natural in the context of visual or metric data. Our work considers the case where the points are located at the nodes of an implicit graph and defines the distance between two points as the length of the shortest path between them. We first improve and simplify an existing result by Brody et al. [4] for the case where the graph is a cycle. We then generalize our results to arbitrary graph metrics. Our approach is to recast the problem of estimating Earth-Mover distance in terms of an $\ell_1$ regression problem. The resulting linear sketches also yield space-efficient data stream algorithms in the usual way.
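To make the link between Earth-Mover distance on a cycle and $\ell_1$ regression concrete, here is a minimal Python sketch. It is not the authors' sketching algorithm: it computes the distance exactly, in linear space, using the standard observation that on a cycle the optimal cost is the minimum over a single shift parameter of a sum of absolute values. It assumes unit-length edges and point sets given as per-node count vectors; the function name `cycle_emd` is hypothetical.

```python
from itertools import accumulate
from statistics import median_low


def cycle_emd(x, y):
    """Exact EMD between two multisets on an n-node cycle with unit edges.

    x[i] and y[i] are the numbers of points of each set at node i.
    Both sets must contain the same total number of points.
    """
    if len(x) != len(y) or sum(x) != sum(y):
        raise ValueError("need equal-length count vectors with equal totals")
    # Prefix sums of the difference vector give the flow across each edge
    # when nothing crosses the edge between node n-1 and node 0.
    prefix = list(accumulate(a - b for a, b in zip(x, y)))
    # Routing c extra units around the whole cycle shifts every edge flow
    # by the same amount, so the cost is min over c of sum |prefix[i] - c|.
    # This one-parameter l1 regression is minimised at a median.
    c = median_low(prefix)
    return sum(abs(p - c) for p in prefix)
```

For example, `cycle_emd([2, 0, 0, 0], [0, 1, 0, 1])` evaluates to 2: the two points at node 0 each move one step, one in each direction around the 4-cycle.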

Similar resources

On constant factor approximation for earth mover distance over doubling metrics

Given a metric space $(X, d_X)$, the earth mover distance between two distributions over $X$ is defined as the minimum cost of a bipartite matching between the two distributions. The doubling dimension of a metric $(X, d_X)$ is the smallest value $\alpha$ such that every ball in $X$ can be covered by $2^{\alpha}$ balls of half the radius. A metric (or a sequence of metrics) is called doubling precisely if its doubling dime...

Impossibility of Sketching of the 3D Transportation Metric with Quadratic Cost

Transportation cost metrics, also known as the Wasserstein distances $W_p$, are a natural choice for defining distances between two pointsets, or distributions, and have been applied in numerous fields. From the computational perspective, there has been an intensive research effort for understanding the $W_p$ metrics over $\mathbb{R}^k$, with work on the $W_1$ metric (a.k.a. earth mover distance) being most successfu...

Improved Approximation Algorithms for Earth-Mover Distance in Data Streams

For two multisets $S$ and $T$ of points in $[\Delta]^2$, such that $|S| = |T| = n$, the earth-mover distance (EMD) between $S$ and $T$ is the minimum cost of a perfect bipartite matching with edges between points in $S$ and $T$, i.e., $\mathrm{EMD}(S, T) = \min_{\pi : S \to T} \sum_{a \in S} \|a - \pi(a)\|_1$, where $\pi$ ranges over all one-to-one mappings. The sketching complexity of approximating earth-mover distance in the two-dimensional grid is men...

Rademacher-Sketch: A Dimensionality-Reducing Embedding for Sum-Product Norms, with an Application to Earth-Mover Distance

Consider a sum-product normed space, i.e. a space of the form $Y = \ell_1 \otimes X$, where $X$ is another normed space. Each element in $Y$ consists of a length-$n$ vector of elements in $X$, and the norm of an element in $Y$ is the sum of the norms of its coordinates. In this paper we show a constant-distortion embedding from the normed space $\ell_1 \otimes X$ into a lower-dimensional normed space $\ell_1^{n'} \otimes X$, where $n' \ll n$ is ...

Earth Mover's Distance and Equivalent Metrics for Spaces with Hierarchical Partition Trees

We define four metrics between probability measures on a space equipped with a hierarchical partition tree, and prove their equivalence. Similar metrics have previously been defined in more restrictive settings; in particular, the well-known Earth Mover's Distance is widely used in machine learning. We adapt the definitions of these metrics to a much broader class of geometries, and use machinery ...


Publication date: 2013